Complex Function Sets Improve Symbolic Discriminant Analysis of Microarray Data
Abstract
Our ability to simultaneously measure the expression levels of thousands of genes in biological samples is providing important new opportunities for improving the diagnosis, prevention, and treatment of common diseases. However, new technologies such as DNA microarrays are generating new challenges for variable selection and statistical modeling. In response to these challenges, a genetic programming-based strategy called symbolic discriminant analysis (SDA) for the automatic selection of gene expression variables and mathematical functions for statistical modeling of clinical endpoints has been developed. The initial development and evaluation of SDA has focused on a function set consisting of only the four basic arithmetic operators. The goal of the present study is to evaluate whether adding more complex operators such as square root to the function set improves SDA modeling of microarray data. The results presented in this paper demonstrate that adding complex functions to the function set significantly improves SDA modeling by reducing model size and, in some cases, reducing classification error and runtime. We anticipate SDA will be an important new evolutionary computation tool to be added to the repertoire of methods for the analysis of microarray data.
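To make the distinction between the function set and the terminals concrete, the following is a minimal Python sketch of how a discriminant expression could be evaluated under a basic and an extended function set. It is not the SDA implementation described in the paper: the protected division and protected square root, the gene names, the tuple-based tree encoding, and the fixed threshold of 0.0 are all assumptions made for illustration.

```python
import math
import operator


def protected_div(a, b):
    # Assumption: a common GP convention that returns 1.0 on near-zero denominators.
    return a / b if abs(b) > 1e-12 else 1.0


def protected_sqrt(a):
    # Assumption: square root of the absolute value, so negative inputs stay valid.
    return math.sqrt(abs(a))


# Function sets map an operator name to (callable, arity).
BASIC_FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
    "/": (protected_div, 2),
}
EXTENDED_FUNCTIONS = {**BASIC_FUNCTIONS, "sqrt": (protected_sqrt, 1)}


def evaluate(tree, sample, functions):
    """Evaluate an expression tree on one sample.

    A tree is a gene name (terminal), a numeric constant, or a tuple
    (operator_name, child, ...). `sample` maps gene names to expression levels.
    """
    if isinstance(tree, str):
        return sample[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    name, *children = tree
    func, arity = functions[name]
    if len(children) != arity:
        raise ValueError(f"{name} expects {arity} argument(s)")
    return func(*(evaluate(child, sample, functions) for child in children))


def classify(tree, sample, functions, threshold=0.0):
    # Hypothetical decision rule: class 1 if the discriminant score exceeds the threshold.
    return 1 if evaluate(tree, sample, functions) > threshold else 0


# Hypothetical model: sqrt(gene_a) - gene_b
model = ("-", ("sqrt", "gene_a"), "gene_b")
print(classify(model, {"gene_a": 9.0, "gene_b": 2.0}, EXTENDED_FUNCTIONS))
print(classify(model, {"gene_a": 1.0, "gene_b": 2.0}, EXTENDED_FUNCTIONS))
```

The same model cannot be expressed with `BASIC_FUNCTIONS` alone, because `sqrt` is not in that set. This illustrates one way in which a richer function set might allow smaller models.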
Similar resources
Incorporating canonical discriminant attributes in classification learning
This paper describes a method for incorporating canonical discriminant attributes in classification machine learning. Though decision trees and rules have semantic appeal when building expert systems, the merits of discriminant analysis are well documented. For data sets on which discriminant analysis obtains significantly better predictive accuracy than symbolic machine learning, the incorpora...
Pdmclass Function to Classify Microarray Data Using Penalized Discriminant Methods
Description: This function is used to classify microarray data. Since the underlying model fit is based on penalized discriminant methods, there is no need for a pre-filtering step to reduce the number of genes.
Usage: pdmClass(formula, method = c("pls", "pcr", "ridge"), keep.fitted =
Arguments:
formula: A symbolic description of the model to be fit. Details given below.
method: One of "pls", "pcr"...
Hybrid Filter-Wrapper with a Specialized Random Multi-Parent Crossover Operator for Gene Selection and Classification Problems
The microarray data classification problem is a recent complex pattern recognition problem. The most important goal in supervised classification of microarray data is to select a small number of relevant genes from the initial data in order to obtain high predictive classification accuracy. With the framework of a hybrid filter-wrapper, we study in this paper the role of the multi-parent recom...
Global gene expression analysis using microarray to study differential vulnerability to neurodegeneration
Neurodegenerative disorders such as Parkinson’s disease, motor neuron disease and Alzheimer’s disease are characterized by loss of specific cells within certain regions of the brain. One of the most compelling questions is to determine why specific cell populations are vulnerable to neurodegeneration. We addressed this question by studying global gene expression changes using an animal model of ...